DETAILED ACTION 

1. Claims 1-7, 9-11, 13, 15-22, and 24-44 have been examined. 

Papers Submitted 

2. It is hereby acknowledged that the following papers have been received and placed of 
record in the file: Amendment as received on 5/4/2009. 

Specification 

3. The amended title of the invention is not descriptive. A new title is required that is 
clearly indicative of the invention to which the claims are directed. The examiner recommends 
incorporating the concept of avoiding execution of instructions in the trailing thread by 
committing results obtained by the advanced thread to the register file used by the trailing thread. 

Claim Objections 

4. Claim 1 is objected to because of the following informalities: 

• In line 5, please replace "and" with --that--. 

• In the 2nd line of the last paragraph, delete "and". 
Appropriate correction is required. 

5. Claim 7 is objected to because of the following informalities: In line 1, insert a dash 
between "load" and "ordering" for consistency (w/ claim 5). Appropriate correction is required. 

6. Claims 20-22 and 24-28 are objected to because there is some confusion as to what 
statutory category of invention the claim falls in. If applicant is trying to claim an article of 
manufacture, then it is asked that applicant replace "An apparatus comprising a" with --A-- such 
that the claim is clearly directed to a machine-readable storage medium containing instructions. 
This is the recommended course of action. If, however, applicant is trying to claim an apparatus, 
then the current claim is not a proper apparatus claim because it sets forth method steps. One 
way to resolve this is to state that the apparatus comprises a machine, or some other hardware, 
and the medium, and that the machine executes the instructions on the medium. Note that if 
applicant deletes the "apparatus" language from claim 20, it should also be deleted from the 
dependent claims. Appropriate correction is required. 

7. Claim 29 is objected to because of the following informalities: 

• In line 8, replace "and" with --that--. 

• In the 3rd to last line, replace "avoid" with --avoids--. 
Appropriate correction is required. 

8. Claim 37 is objected to because of the following informalities: In line 5, replace "and" 
with --that--. Appropriate correction is required. 

9. Claim 40 is objected to because of the following informalities: In line 1, replace "37" 
with --39--. The current dependency is incorrect. Appropriate correction is required. 

10. Claim 42 is objected to because of the following informalities: In line 1, replace "37" 
with --41--. The current dependency is incorrect. Appropriate correction is required. 

11. Claim 43 is objected to because of the following informalities: 

• In line 1, please replace "37" with either --41-- or --42--, as the current 
dependency is incorrect. If modeled after claim 7, which is similar to claim 43, 
"37" should be replaced with --42--. 

• In line 1, insert a dash between "load" and "ordering" for consistency (w/ claim 
41). 

Appropriate correction is required. 

Claim Rejections - 35 USC §103 

12. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

13. Claims 1-7 and 9-10 are rejected under 35 U.S.C. 103(a) as being unpatentable over Nair 
et al., U.S. Patent No. 7,017,073 (herein referred to as Nair), in view of Hennessy and Patterson, 
"Computer Architecture - A Quantitative Approach, 2nd Edition," 1996 (herein referred to as 
Hennessy), and further in view of Sundaramoorthy et al., "Slipstream Processors: Improving 
both Performance and Fault Tolerance", 2000 (herein referred to as Sundaramoorthy), and the 
examiner's taking of Official Notice. 

14. Referring to claim 1, Nair has taught an apparatus comprising: 

a) a first processor (Fig. 1, component 1) and a second processor (Fig. 1, component 2) each 
having a decoder. Note that since each processor receives and executes its own instructions, as 
shown in Fig. 1, each processor inherently has a decoder, as instructions must be decoded before 
they are executed. 

b) a plurality of memory devices coupled to the first processor and the second processor. See 
Fig. 1, component 14, and column 4, line 6, to column 5, line 18. At least a PROM and cache are 
coupled to the processors. 

c) a second buffer coupled to the first processor and the second processor, the second buffer 
being a trace buffer. See Fig. 1, at least component 12, which includes branch outcomes (trace 
information). 

d) wherein the first processor and the second processor perform single threaded applications 
using multithreading resources. See Fig. 2 and note that a single threaded application A is 
executed using multithreaded resources (i.e., simultaneous multithreaded (SMT) processors 0 
and 1). 

e) the first processor executes a single threaded application ahead of the second processor 
executing said single threaded application to avoid misprediction, and said single threaded 
application is not converted to an explicit multiple thread application. See column 4, lines 30-65, 
and note that thread A is executed ahead of A'. 

f) the single threaded application executed on the second processor avoids branch mispredictions 
from information received from said first processor. See column 4, lines 30-65, and Fig. 1, and 
note that thread A is executed ahead of the other thread A' so that branch outcomes may be 
passed to A' to avoid misprediction. 

g) Nair has not explicitly taught that the first and second processors each have a scoreboard. 
However, Hennessy has taught that a scoreboard allows instructions to execute out of order. As 
is known in the art, out-of-order execution is advantageous because it allows instructions to 
execute as soon as their resources are ready, thereby reducing stalling and CPU idleness. See 
Hennessy, pages 241 and 242. As a result, in order to allow both processors to benefit from such 
execution and resulting advantages, it would have been obvious to one of ordinary skill in the art 
at the time of the invention to modify each of the first and second processors of Nair to include 
scoreboards. 

h) Nair has not taught a first buffer coupled to the first processor and the second processor, the 
first buffer being a register buffer and is operable to transfer register values from the second 
processor to the first processor. However, Sundaramoorthy has taught the concept of passing 
register values, in addition to branch outcome values, between processors using a buffer (Fig. 1, 
delay buffer) so that a trailing thread (R-stream) may utilize the values already computed by the 
leading thread (A-stream), thereby more efficiently executing the R-stream. See column 10, 
lines 17-21, and lines 35-38. As a result, in order to speed up execution of a trailing thread, it 
would have been obvious to one of ordinary skill in the art at the time of the invention to modify 
Nair to include a first buffer coupled to the first processor and the second processor, the first 
buffer being a register buffer and is operable to transfer register values from the second 
processor to the first processor. Based on the examiner's interpretation, Nair, as modified by 
Sundaramoorthy, would include four buffers between the processors. The four buffers would 
include buffers 12 and 13, as shown in Fig. 1 of Nair, and two additional buffers for passing 
register information from each processor, much like buffers 12 and 13. The first buffer would be 
the register buffer that passes data in the same direction as buffer 13 (from thread B to B'). 

i) Nair has not taught a plurality of memory instruction buffers coupled to the first processor and 
the second processor. However, Official Notice is taken that load buffers, store buffers, and 
reorder buffers, and their advantages, are well known and accepted in the art. Specifically, load 
and store buffers buffer long latency memory operations so that the pipeline may continue to 
perform other work. Also, load and store buffers contribute to efficient out-of-order execution of 
loads and stores. Reorder buffers, on the other hand, are inherent in all out-of-order systems, 
because although instructions may execute out of order, they must be retired in order. The 
reorder buffer ensures that instructions are retired in order. As described above, out-of-order 
execution is advantageous because it allows instructions to execute as soon as their resources are 
ready, thereby reducing stalling and CPU idleness. Consequently, to reduce stalling, it would 
have been obvious to one of ordinary skill in the art at the time of the invention to implement at 
least load, store, and reorder buffers in the system of Nair. 
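
For illustration only, the in-order retirement behavior attributed above to a reorder buffer may be sketched as follows. This sketch is not taken from Nair, Hennessy, or Sundaramoorthy; the class name `ReorderBuffer` and its methods are hypothetical, and the model ignores everything except the ordering constraint (instructions may complete in any order but retire only in program order).

```python
from collections import deque


class ReorderBuffer:
    """Toy model (hypothetical): entries are allocated in program order,
    may complete in any order, and retire strictly in program order."""

    def __init__(self):
        # Oldest instruction sits at the left end of the deque.
        self._entries = deque()

    def issue(self, tag):
        """Allocate an entry for an instruction in program order."""
        self._entries.append({"tag": tag, "done": False})

    def complete(self, tag):
        """Mark an instruction as finished executing (any order)."""
        for entry in self._entries:
            if entry["tag"] == tag:
                entry["done"] = True
                return
        raise KeyError(tag)

    def retire(self):
        """Retire the longest completed prefix; stop at the first
        instruction that has not yet completed."""
        retired = []
        while self._entries and self._entries[0]["done"]:
            retired.append(self._entries.popleft()["tag"])
        return retired
```

In this model, an instruction that completes early simply waits in the buffer until every older instruction has also completed.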

15. Referring to claim 2, Nair, as modified, has taught an apparatus as described in claim 1. 
Nair has not explicitly taught that the memory devices comprise a plurality of cache devices. 
Instead, Nair has only taught a single L2 cache. However, Official Notice is taken that L1 caches 
are well known and accepted in the art, especially in systems that already have an L2 cache. An 
L1 cache is faster than an L2 cache, thereby speeding up access to most recently accessed data. 
As a result, it would have been obvious to one of ordinary skill in the art at the time of the 
invention to further modify Nair to include an L1 cache. 

16. Referring to claim 3, Nair, as modified, has taught an apparatus as described in claim 1. 
Nair has not taught that the first processor is coupled to at least one of a plurality of zero level 
(L0) data cache devices and at least one of a plurality of L0 instruction cache devices, and the 
second processor is coupled to at least one of the plurality of L0 data cache devices and at least 
one of the plurality of L0 instruction cache devices. However, Sundaramoorthy has taught such 
a concept. See Fig. 1 and note that each processor is connected to a separate data cache 
(D-Cache) and instruction cache (I-Cache), which can be considered zero-level caches because 
they are directly connected to the execute cores. By having individual caches, bus contention would be 
reduced between processors since they wouldn't be fighting for the same cache. As a result, it 
would have been obvious to one of ordinary skill in the art at the time of the invention to further 
modify Nair such that the first processor is coupled to at least one of a plurality of zero level (L0) 
data cache devices and at least one of a plurality of L0 instruction cache devices, and the second 
processor is coupled to at least one of the plurality of L0 data cache devices and at least one of 
the plurality of L0 instruction cache devices. 

17. Referring to claim 4, Nair, as modified, has taught an apparatus as described in claim 3, 
wherein each of the plurality of LO data cache devices store exact copies of store instruction data. 
Although this is not mentioned explicitly, it is deemed inherent to the design because as each 
processor is executing the same thread, the data caches in each processor must contain exact 
copies of data. And, this data is store instruction data because data that is stored to main memory 
is also stored in a data cache. 

18. Referring to claim 5, Nair, as modified, has taught an apparatus as described in claim 1, 
wherein the plurality of memory instruction buffers includes at least one store forwarding buffer 
and at least one load-ordering buffer (recall from the rejection of claim 1 that it would have been 
obvious to modify Nair to include a load buffer and a store buffer, which forwards stores to main 
memory). 

19. Referring to claim 6, Nair, as modified, has taught an apparatus as described in claim 5. 
Although Nair, as modified, has not explicitly taught that the at least one store forwarding buffer 
comprises a structure having a plurality of entries, each of the plurality of entries having a tag 
portion, a validity portion, a data portion, a store instruction identification (ID) portion, and a 
thread ID portion, such fields are well known in the art and all relate to tracking a particular 
instruction and identifying data associated with that instruction. As a result, it would have been 
obvious to one of ordinary skill in the art at the time of the invention to further modify Nair to 
include these fields. 

20. Referring to claim 7, Nair, as modified, has taught an apparatus as described in claim 6. 
Although Nair, as modified, does not mention that the at least one load ordering buffer comprises 
a structure having a plurality of entries, each of the plurality of entries having a tag portion, an 
entry validity portion, a load identification (ID) portion, and a load thread ID portion, such fields 
are well known in the art and all relate to tracking a particular instruction and identifying data 
associated with that instruction. As a result, it would have been obvious to one of ordinary skill 
in the art at the time of the invention to further modify Nair to include these fields. 

21. Referring to claim 9, Nair, as modified, has taught an apparatus as described in claim 1. 
Nair, despite teaching trace queues 12, 13 in Fig. 1, does not disclose that the trace buffer is a 
circular buffer having an array with head and tail pointers, the head and tail pointers having a 
wrap-around bit. However, "Official Notice" is taken that it is well known and expected in the 
art to implement a FIFO queue as a circular buffer with head and tail pointers wherein head and 
tail pointers have a wrap-around bit. A circular buffer is useful to implement in hardware 
because only the head and tail pointers need to be incremented/decremented instead of actually 
physically shifting entries. A wrap around bit would also be needed to indicate whether the 
pointer has wrapped around the end of the queue. Therefore, it would have been obvious to one of 
ordinary skill in the art at the time of the invention to have implemented the FIFO queue as a 
circular buffer with head and tail pointers, the head and tail pointers having a wrap around bit 
because it is known that a FIFO queue can be implemented as a circular buffer and it is easier to 
build in hardware. 
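
For illustration only, a FIFO queue implemented as a circular buffer with head and tail pointers, each carrying a wrap-around bit, may be sketched as follows. This sketch is not taken from Nair or any other reference of record; the class name `CircularFifo` and its fields are hypothetical. It assumes the common convention that the queue is empty when the two pointers match in both index and wrap bit, and full when the indices match but the wrap bits differ.

```python
class CircularFifo:
    """Hypothetical circular FIFO whose head and tail pointers each
    consist of an index plus a one-bit wrap-around flag."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots = [None] * capacity
        self._capacity = capacity
        self._head = 0       # index of the oldest entry
        self._head_wrap = 0  # toggles each time head passes the end
        self._tail = 0       # index of the next free slot
        self._tail_wrap = 0  # toggles each time tail passes the end

    def is_empty(self):
        # Same index, same wrap bit: tail has not moved past head.
        return self._head == self._tail and self._head_wrap == self._tail_wrap

    def is_full(self):
        # Same index, different wrap bit: tail is one full lap ahead.
        return self._head == self._tail and self._head_wrap != self._tail_wrap

    def push(self, value):
        if self.is_full():
            raise OverflowError("FIFO is full")
        self._slots[self._tail] = value
        self._tail += 1
        if self._tail == self._capacity:
            self._tail = 0
            self._tail_wrap ^= 1

    def pop(self):
        if self.is_empty():
            raise IndexError("FIFO is empty")
        value = self._slots[self._head]
        self._head += 1
        if self._head == self._capacity:
            self._head = 0
            self._head_wrap ^= 1
        return value
```

Only the pointers move on each push or pop; no entry is ever shifted, and the wrap bits distinguish the full condition from the empty condition when the two indices coincide.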

22. Referring to claim 10, Nair, as modified, has taught an apparatus as described in claim 1. 
Nair, as modified, has not explicitly taught that the register buffer comprises an integer register 
buffer and a predicate register buffer. However, Official Notice is taken that integer registers 
and predicate registers are well known and expected in the art. By implementing integer 
registers, the system will be able to load and store integer data and perform integer operations 
quickly. Furthermore, by implementing predicate registers, the system will be able to achieve 
conditional execution of instructions without conditional branch instructions. Consequently, to 
achieve such functionality, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify Nair to include an integer register buffer and a predicate register 
buffer in the register buffer (delay buffer). 



Allowable Subject Matter 

23. Claims 11, 13, 15-22, and 24-44 are allowed. The examiner asserts that after 
reconsideration, the prior art of record has not taught, nor rendered obvious, the transmitting 
results from the second processor to the first processor, the first processor avoiding executing a 
portion of instructions by committing the results of the portion of instructions into a register file 
from a first buffer, the first buffer being a trace buffer. 

Please correct any objections associated with these claims. 
Response to Arguments 

24. Applicant's arguments filed on May 4, 2009, have been fully considered but they are not 
persuasive. 

25. Applicant argues the novelty/rejection of claim 1 on pages 15-16 of the remarks, in 
substance that: 

"Applicants would like to point out that Nair '073 claims priority to a provisional application 
60/272,138 filed on Feb 28, 2001 (hereinafter Nair '138). Nair '073 and Nair '138 do not have the 
same specification. The filing date of Nair '073, Feb 27, 2002, is later than the filing date of 
Applicants' application (i.e., June 28, 2001). Applicants submit that it is improper to use Nair '073 
as a reference in the Office Action." 

26. These arguments are not found persuasive for the following reasons: 

a) The examiner has reviewed the provisional application ('138) to which Nair '073 claims 
priority. While the specifications are not exactly the same, MPEP 201.11 does not require that 
the specifications be the exact same. ". . .for a claim in a later filed nonprovisional application to 
be entitled to the benefit of the filing date of the provisional application, the written description 
and drawing(s) (if any) of the provisional application must adequately support and enable the 
subject matter of the claim in the later filed nonprovisional application." The examiner asserts 
that the subject matter relied upon in Nair '073 to reject claim 1 is adequately supported and 
enabled by the '138. Therefore, the priority date for Nair is applicable in this situation. 

27. Applicant argues the novelty/rejection of claim 1 on page 16 of the remarks, in substance 
that: 

"Nair is silent about disclosing 'a plurality of memory devices coupled to the first 
processor and the second processor.'" 

28. These arguments are not found persuasive for the following reasons: 
a) As stated in the rejections above, see column 4, line 66, to column 5, line 6. At least multiple 
caches are coupled to the processors. 



29. Applicant argues the novelty/rejection of claim 1 on page 17 of the remarks, in substance 
that: 

"Sundaramoorthy discloses a multiprocessor system that executes two (i.e., multiple 
streams/threads) pseudo-redundant programs on separate processors on the same chip. Claim 1 
requires "a first buffer coupled to the first processor and the second processor, the first buffer 
being a register buffer and is operable to transfer register values from the second processor to 
the first processor", where the first processor executes a single threaded application ahead of the 
second processor as recited in claim 1. The Office Action alleges that "a first buffer" is disclosed 
by the delay buffer in Sundaramoorthy (Fig. 1 and col. 10, lines 17-21). Applicants respectfully 
disagree. Sundaramoorthy states that "the delay buffer is a simple FIFO queue that allows the 
A-stream to communicate control flow and data flow outcomes to the R-stream." Sundaramoorthy 
defines that "the leading program is called the advanced stream, or A-stream, and the trailing 
program is called the redundant stream, or R-stream." (col. 2, lines 4-6). Sundaramoorthy 
discloses a delay buffer that allows the advance stream to communicate control flow and data 
flow from A-stream to R-stream (not the other way around). In short, a delay buffer is not the first 
buffer as claimed in claim 1. Sundaramoorthy fails to disclose the limitation as required. 
Sundaramoorthy also fails to mention "a plurality of memory devices coupled to the first 
processor and the second processor"." 

30. These arguments are not found persuasive for the following reasons: 

a) As stated in the rejection, Sundaramoorthy has taught the concept of passing register values, in 
addition to branch outcome values, between processors using a buffer (Fig. 1, delay buffer) so 
that a trailing thread (R-stream) may utilize the values already computed by the leading thread 
(A-stream), thereby more efficiently executing the R-stream. See column 10, lines 17-21, and 
lines 35-38. As a result, in order to speed up execution of a trailing thread, it would have been 
obvious to one of ordinary skill in the art at the time of the invention to modify Nair to include a 
first buffer coupled to the first processor and the second processor, the first buffer being a 
register buffer and is operable to transfer register values from the second processor to the first 
processor. Based on the examiner's interpretation, Nair, as modified by Sundaramoorthy, would 
include four buffers between the processors. The four buffers would include buffers 12 and 13, 
as shown in Fig. 1 of Nair, and two additional buffers for passing register information from each 
processor, much like buffers 12 and 13. The first buffer would be the register buffer that passes 
data in the same direction as buffer 13 (from thread B to B'). Hence, it should be noted that the 
first buffer does not pass register values from the first to the second processor, as applicant 
argues, but passes values from the second processor to the first processor (for thread B' from B). 



31. Applicant argues the novelty/rejection of claim 1 on pages 17-18 of the remarks, in 
substance that: 

"Moreover, it is asserted in the Office Action it would be obvious to combine Nair '138 
with Sundaramoorthy in order to speed up execution of a trailing thread (Office Action, page 7, 
lines 9-10). As explained above, however, check thread (A'), which is the trailing thread, naturally 
tend to run more slowly (Nair '138, page 2, paragraph 3). As stated in MPEP 2143.01 (VI), the 
proposed modification can not change the principle of operation of a reference. "If the proposed 
modification or combination of the prior art would change the principle of operation of the prior art 
invention being modified, then the teachings of the references are not sufficient to render the 
claims prima facie obvious." Applicants submit that the combination proposed in the Office Action 
contradicts to the principle of Nair '138. Therefore, the combination fails to establish a prima facie 
case with respect to claim 1." 

32. These arguments are not found persuasive for the following reasons: 

a) Speeding up the trailing thread does nothing to contradict the principle of Nair '138. That is, 
taking the examples of threads B and B', since B is the foreground thread and B' is a background 
thread, B' will naturally run slower than B because B' must wait as foreground thread A executes 
with priority. This is the purpose for the delay buffer 8. Because B instructions finish sooner 
than corresponding B' instructions, the results cannot be checked right away. The delay buffer is 
used to hold B instruction results until the corresponding B' results are produced. Clearly, the 
more B' lags, the larger the buffer must be. As a result, one would still see benefit in speeding 
up B'. The examiner is not saying that it would be obvious to make B' faster than B, as that 
would clearly be contrary to the teachings of Nair. Instead, the examiner believes that it is 
obvious to try and speed up B' while still having it lag behind B. One potential advantage of this 
is to decrease the size of the delay buffer. Also, cross-checking would be completed sooner. 
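
For illustration only, the relationship between the lag of B' and the required delay buffer size may be sketched with a simplified model. The model is an assumption made here and is not taken from Nair: thread B is assumed to produce one result per cycle, and B' is assumed to check each result a fixed number of cycles (`lag`) after B produced it. The function name `peak_delay_buffer_occupancy` is hypothetical.

```python
def peak_delay_buffer_occupancy(num_results, lag):
    """Return the largest number of B results held at the end of any
    cycle, assuming (hypothetically) that B produces result i at cycle i
    and B' checks result i at cycle i + lag."""
    occupancy = 0
    peak = 0
    for cycle in range(num_results + lag):
        if cycle < num_results:
            occupancy += 1  # B deposits a new result in the buffer
        if cycle >= lag:
            occupancy -= 1  # B' checks and frees the oldest result
        peak = max(peak, occupancy)
    return peak
```

Under these assumptions the peak occupancy grows with the lag until it reaches the total number of results, which is consistent with the observation that a smaller lag permits a smaller buffer.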

Conclusion 

33. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within TWO 
MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 
CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, 
however, will the statutory period for reply expire later than SIX MONTHS from the mailing 
date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to DAVID J. HUISMAN whose telephone number is (571) 272-4168. 
The examiner can normally be reached on Monday-Friday (8:00-4:30). 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Eddie Chan, can be reached on (571) 272-4162. The fax phone number for the 
organization where this application or proceeding is assigned is 571-273-8300. 



Application/Control Number: 09/896,526 Page 15 

Art Unit: 2183 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 
like assistance from a USPTO Customer Service Representative or access to the automated 
information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 



/David J. Huisman/ 

Primary Examiner, Art Unit 2183 



